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1  Introduction 

Many  applications  call  for  a  rapid  evaluation  of  discrete  convolutions  of  the  form 

(f*g)(n)  =  5^/(k)s(n-k)  =  /(n  -  k)s(k)  (1) 

keZ71  kez™ 

where  /  and  g  are  real  or  complex  functions  on  a  regular  lattice  Zn.  We  shall  be  dealing  primarily 
with  the  most  interesting  and  challenging  3D  case  (n  =  3). 

Applications  where  convolutions  (1)  arise  include  (a)  digital  signal/image  processing/  filtering 
by  time- invariant  (translation-invariant)  systems  and  (b)  finite  difference  analysis  of  boundary  value 
problems.  In  the  latter  category,  two  related  methodologies  are  particularly  noteworthy.  The  first 
one,  put  forward  and  thoroughly  studied  by  Ryaben’kii,  Tsynkov  and  others  [1,  2],  is  known  as  the 
method  of  difference  potentials  and  can  be  viewed  as  a  discrete  analog  of  the  Calderon  projection 
operators.  The  second  one,  boundary  algebraic  equations  (BAE),  is  at  least  50  years  old  (Saltzer 
[16])  and  is  a  discrete  analog  of  first-  or  second-order  Fredholm  boundary  integral  equations  for 
potential  problems  (Martinsson  &  Rodin  [15]). 

The  primary  objective  of  this  work  is  to  develop  a  Fast  Multipole  Method  on  a  Lattice  (FMML) 
for  computing  convolutions  (1).  The  “source  nodes”  where  /  is  nonzero  are  assumed  to  be  confined 
to  a  finite  box  containing  a  fragment  of  the  lattice;  this  box  is  recursively  subdivided  into  a  nested 
sequence  of  smaller  boxes  that  form  a  tree  structure  (quadtree  in  2D,  as  each  box  gets  subdivided 
into  four  children.)  In  contrast,  g  does  not  have  to  have  a  bounded  support;  in  boundary  value 
problems,  it  will  represent  discrete  Green’s  function. 

The  general  principles  of  FMML  are  the  same  as  in  the  well-known  continuous  case  [3]:  “far- 
field”  contributions  to  the  potential  due  to  remote  sources  are  computed  via  a  multipole  expansion, 
while  the  near  field  is  computed  directly.  The  multipole  moments  are  computed  in  the  course  of  the 
upward  pass  over  the  hierarchical  structure  of  boxes.  Then  these  multipole  moments  get  converted 
into  local  held  expansions  in  the  course  of  the  downward  pass  from  the  root  box  to  the  finest  level. 

The  ingredients  of  this  computational  procedure  are  worked  out  below  and  include  (i)  the 
multipole  expansion;  (ii)  far- held-to- far-held  translation  (needed  in  the  upward  pass);  (iii)  far-held- 
to- local- held  and  local-to- local- held  translation  formulas  (needed  for  the  downward  pass). 

A  remarkable  feature  of  the  resultant  algorithm  is  its  universality:  it  can  be  applied  to  any 
given  convolution  kernel  g  (discrete  Green’s  function  in  applications  to  boundary  value  problems), 
provided  that  it  is  translation-invariant  and  that  its  hnite  differences  on  the  lattice  decay  rapidly 
enough  for  the  far- held  multipole  expansion  on  the  lattice  to  be  convergent. 

*The  work  of  this  author  was  supported  by  the  US  Army  Research  office,  grant  #  W911NF-11-1-0384. 
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Several  “kernel-independent”  (“black-box”)  multipole  techniques  have  already  been  proposed. 
In  the  method  due  to  Ying  et  al.  [5],  the  multipole  expansions  of  the  kernel  are  replaced  with  a 
continuous  distribution  of  an  equivalent  source  density  on  a  surface  enclosing  a  given  box.  This 
equivalent  density  is  found  by  matching  its  potential  to  the  potential  of  the  original  sources  at  a 
surface  in  the  far  field.  This  approach  builds  up  on  the  earlier  ideas  of  Anderson  [7]  and  Makino  [6]: 
the  use  of  surface  potentials  or,  alternatively,  potentials  due  to  fictitious  “pseudoparticles”  instead  of 
the  multipole  expansion.  Fong  &  Darve  combined  continuous  SVD  with  low-rank  approximation  of 
the  kernel  by  Chebyshev  polynomials  (see  [4]  and  references  therein)  to  produce  a  FMM.  Martinsson 
[9]  uses  polyharmonic  approximations  of  the  kernel  to  construct  a  fast  multipole  method. 

2  The  Fast  Multipole  Method  on  a  Lattice 

The  fast  multipole  algorithms  (FMA)  are  usually  frequency  dependent.  If  the  dimension  of  the 
object  is  above  0.5  wavelengths,  MLFMA  (Multi-level  FMA)  works  properly,  since  it  uses  the 
dominant  propagating  wave  in  the  translation  process  [10].  But  if  the  object  size  is  below  0.1 
wavelengths,  LFFMA  (Low  Frequency  FMA)  is  employed  [11].  In  between,  MFFMA  (Mixed- form 
FMA)  is  used  to  analytically  connect  multipole  and  plane  wave  expansions  [12]. 

The  most  important  step  is  that  the  source-field  interaction  kernel  or  the  Green’s  function 
G(rji)  must  be  cast  into  the  following  translation  form: 

G(rji )  =  ■  P j1j2(y jlj2)  •  aj2/2(rj2/2)  •  Pl2h{rhIl)  ■  Phi( rhi)  (2) 

where 

rji  =  r jJx  +  r  JlJ2  +  r  j2/2  +  rhIl  +  rhi  (3) 

as  plotted  in  Fig.  1.  Pjj]  and  Pr]t  are  column  vectors  called  receiving  and  radiation  patterns,  respec¬ 
tively.  fij1j2  and  are  matrices  (dense  for  low  frequencies  and  diagonal  for  mid- frequencies) 
converting  incoming  and  outgoing  waves  between  parent  boxes  and  child  boxes.  They  could  be 
considered  as  far-field-to-far-field  or  local-field-to-local-field  transformations.  dj2/2  is  an  outgoing- 
wave  to  incoming- wave  translator  (021  translator),  or  far- field-to- local-field  translator.  There  are 
316  021  translators  for  3D  problems.  But  each  time  only  outgoing  waves  from  nonempty  boxes  in 
the  interaction  list  are  translated.  Mathematically,  the  addition  theorem  is  used  to  generate  this 
translation  format.  Many  FMA  performance  improvements  can  be  realized  by  working  out  novel 
translators. 


Figure  1:  FMA  decomposition  of  the  vector  r;j.  2D  rendition  for  simplicity. 
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For  the  scattering  problem,  the  object  size  is  usually  comparable  to  the  wavelength.  Hence, 
MLFMA  is  employed.  The  free  space  Green’s  function  can  be  decomposed  into  the  integration  of 
plane  waves  to  all  directions.  From  the  addition  theorem  [13],  the  free  space  Green’s  function  can 
be  generally  represented  by 


gifc|D+d| 
47t|D  +  d 


Y  +  1  )ji{kd)hf\kD)Pi{d  •  D) 

71  1=0 


(4) 


where  ji(x)  is  the  spherical  Bessel  function  of  the  first  kind,  h\L\x)  is  the  spherical  Hankel  function 
of  the  first  kind,  Pi(x)  is  the  Legendre  polynomial,  and  d  <  D.  It  is  further  derived  into  a 
diagonal  translation  format  by  introducing  a  plane  wave  integration,  truncating  the  summation, 
and  exchanging  the  order  of  integration  and  summation 


eiA:|D+d| 
47t|D  +  d 


ik 

47T 


dHeikdaL(H,D) 


(5) 


where  f  dfl  represents  the  integral  over  a  unit  sphere.  The  far-held-to-local-field  translator  cxl(£1,  D) 
is  defined  as 


aL(ft,D)  =  ±jriil(2l  +  l)hi1\kD)Pl{k-D)  (6) 

n  1=0 

The  term  e*kd  is  a  combination  of  far-field-to-far-field  transformations,  local-field-to-local-field 
transformations,  radiation  patterns  and  receiving  patterns.  In  the  conventional  BEM,  FMA  was 
applied  to  sources  represented  by  RWG  basis  over  triangular  patches  [14]. 

P(k)  =  [  eik  r' S(r') dr'  (7) 

Js 

On  a  lattice,  the  sources  are  all  point  sources.  Hence,  the  radiation  pattern  setup  becomes  much 
simpler. 


Discretization  of  the  integral  in  (5)  results  in  a  diagonal  translation  form.  However,  because  the 
Green’s  function  is  decomposed  into  plane  waves,  essentially  we  are  not  using  multipoles  except  for 
the  translator,  k  is  sampled  over  a  unit  sphere  for  all  angles.  The  list  of  P(k)  forms  the  vector  of 
(8)' 

Because  the  size  of  the  box  doubles  when  transformations  are  performed  from  a  lower  level  of 
the  FMA  oct-tree  to  the  parent  level,  the  bandwidth  increases  correspondingly.  Hence,  denser  data 
sampling  is  needed  at  parent  levels.  To  avoid  unnecessary  memory  at  lower  levels,  interpolation 
schemes  are  used  during  the  upward  tree  browsing  process.  Hence,  (2)  is  modified  by  adding  the 
interpolation  matrix  / 

G(rji )  =  •  f  •  PjlJa(TjlJa)  •  aj2/2(rj2/2)  •  Phh(rhIl)  ■  I  •  f3hi( rhi)  (9) 

The  error  of  this  mid-frequency  MLFMA  is  controlled  by  the  truncation  number  L  in  the  translator 
and  the  buffer  zone.  Because  of  the  computer’s  limited  numerical  precision,  L  cannot  be  infinitely 
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large.  A  lot  of  work  has  been  done  to  refine  the  estimation  of  L  for  the  best  possible  accuracy.  A 
very  nice  estimation  is  using  the  following  excess  bandwidth  equation: 

L  ~  kd+1.8d20/3(kd)1/3  (10) 

where  do  =  log(l/e).  This  method  is  very  effective  for  mid- frequency  applications  since  the  accuracy 
could  reach  10“5  easily  when  the  box  size  is  large  compared  to  the  wavelength.  However,  when  the 
frequency  is  low,  even  the  best  excess  bandwidth  equation  could  not  improve  the  error  too  much. 

The  near  field  interaction  is  computed  directly.  The  conventional  MoM  method  could  be  applied 
to  this  part  to  collect  the  contributions  from  near  neighbors. 


3  The  Lattice  Green  Function  for  the  Scalar  Wave  Equation  in 
3D 

The  linear  scalar  wave  equation  (the  Helmholtz  equation)  in  3D  is 

V2u  +  k2u  =  /,  (11) 

where  k  is  a  given  parameter  (the  wavenumber),  /  (the  source)  is  a  given  function  of  coordinates, 
and  u  is  an  unknown  function  to  be  found. 

The  Green  function  G(r)  for  (11)  is,  by  definition,  generated  by  the  delta-source  /  =  <5(r);  i.e. 

V2G  +  k2G  =  S{  r),  (12) 

G  has  the  well  known  analytical  form 

G(r)  -  «  (13) 

47rr 

We  now  consider  discrete  analogs  of  (11),  (13).  Let  (12)  be  approximated  by  the  standard  seven- 
point  flux-balance  scheme  on  a  uniform  grid  with  a  size  h  (for  simplicity,  the  same  in  all  three 
directions): 


(6  k  h  ) Qix-\-l,iy,iz  9ix—l,iy,iz  9ix,iy-\-l,iz  9ix,iy—l,iz  9ix,iy,iz-\- 1 


yix,iy,iz—l 


where  g  is  the  lattice  Green  function  (LGF),  ix,iy,iz  are  integer  indexes,  and  6  in  this  expression 
is  the  discrete  delta- function  (the  Kronecker  delta  in  3D). 

There  are  at  least  two  general  ways  to  compute  the  LGF:  Fourier  analysis  and  finite  difference 
solutions.  A  detailed  exposition  for  the  Laplace  equation  has  been  given  by  Martinsson  Sz  Rodin 
[9,  8,  15];  for  the  wave  problem,  see  [17]. 

Fourier  analysis  is  quite  involved  and  must  be  performed  with  great  care.  For  our  purposes,  a 
more  straightforward  route  is  sufficient.  The  finite  difference  problem  (14)  for  the  Green  function 
can  be  solved  directly,  with  the  continuous  Green  function  imposed  as  a  boundary  condition  on  a 
large  enough  cube  [— M,  M]3.  Such  a  direct  solution  can  be  efficiently  obtained  via  FFT,  as  done 
for  the  Laplace  equation  by  Kansa  et  al.  [18]. 

An  example  of  LGF  for  A  =  1  (k  =  27 r)  and  h  =  0.1  is  shown  in  Fig.  3  (ID  plot)  and  Fig.  3 
(surface  plot). 
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Figure  2:  a  ID  plot  (y  =  z  =  0,  x  varies)  of  the  LGF  for  A  =  1,  h  =  0.1. 
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4  Conclusion 


The  proposed  Fast  Multipole  Method  on  a  lattice  has  the  following  salient  advantages: 

•  Close-to-optimal  (apart  from  the  logarithmic  factors)  asymptotic  number  of  operations  with 
respect  to  the  number  of  grid  nodes. 

•  In  applications  to  finite  difference  analysis  of  boundary  value  problems,  the  method  paves  the 
way  for  the  efficient  solution  of  boundary  algebraic  equations  (Martinsson  &  Rodin  [9,  15]]), 
or,  alternatively,  equations  of  difference  potentials  -  a  discrete  analog  of  Calderon  projections 
(Ryaben’kii  Sz  Tsynkov  [1,  2]).  The  latter  method  and  its  computational  efficiency  were 
recently  analyzed  in  detail  for  problems  involving  exterior  magnetic  fields  of  fusion  devices 
[18]- 
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